Abstract. In the last few years there has been an enormous effort in parameter
estimation studies for different sources with the space-based gravitational
wave detector, LISA. While these studies have investigated sources of differing
complexity, they all assume continuous data streams. In reality, the LISA data
stream will contain gaps caused by events such as repointing of the satellite
antennae, discharging of static charge build-up on the satellites, and
disruptions due to micro-meteor strikes. In this work we conduct a large-scale
Monte Carlo parameter estimation simulation for galactic binaries assuming data
streams containing gaps. As the expected duration and frequency of the gaps are
currently unknown, we focus on gaps of approximately one hour, occurring either
once per day or once per week. We also study the case where, in addition to the
expected periodic gaps, we have a data drop-out of one continuous week. Our
results show that for galactic binaries, one gap per week introduces a bias of
between 0.5% and 1% in the estimation of the most important parameters, such as
the sky position, amplitude and frequency. This number rises to between 3% and
7% for one gap per day, and to between 4% and 9% when we have one gap per day
and a spurious gap of a week. A future study will investigate the effect of
data gaps on supermassive black hole binaries and extreme mass ratio inspirals.

1. Introduction

The planned ESA-NASA Laser Interferometer Space Antenna (LISA) [1, 2] will be the
first space-based gravitational wave (GW) detector. It will consist of three identical
free-falling spacecraft in an equilateral triangle configuration, with each pair of
spacecraft separated by $5 \times 10^{9}$ m. The constellation orbits the Sun, 20°
behind the Earth, inclined at an angle of 60° to the ecliptic. LISA is expected to
detect sources in the frequency range $10^{-5}$ to $10^{-1}$ Hz. This should allow us
to detect and extract parameters for sources such as galactic white dwarf binaries
(GBs), supermassive black hole binaries (SMBHBs) to redshifts of $z \approx 20$ [3, 4],
extreme mass ratio inspirals (EMRIs) to approximately $z = 1$ [5, 6], kinks and cusps
of cosmic superstrings, and a stochastic background from low mass black hole
binaries [7].

In general, the sources detected by LISA will have a large signal to noise ratio
(SNR). As the sources will be loud, an ideal method to detect them and extract their
parameters is matched filtering [8]. This technique works by cross-correlating the
detector output with theoretical waveform models, or templates. As matched filtering
is especially sensitive to the phase information of the waveform, a high correlation
between signal and template allows us to make a prediction of the source parameters.

A number of studies have already been conducted on parameter estimation for
a range of different sources. While these works have focused on different source
types, they all assume uninterrupted data. In reality we know that communication
between the spacecraft and Earth will be interrupted at various times during the
mission. The interruptions are likely to come from events such as antenna
re-alignment, discharging of static charge build-up on the spacecraft, micro-meteor
impacts, or hardware failures either on the spacecraft or on Earth. The gaps in the
data not only cause a loss of signal but, if not properly treated, could also
generate massive power leakage in the Fourier domain, where the data analysis is
conducted, because of the discontinuity they introduce in the time domain. This
excess power would result in spurious detection statistics and parameter estimates.

While a number of other works have focused on the implementation of time 
domain tapers for the termination of inspiral waveforms where we are not taking 
the merger and ringdown into account [H [9], to our knowledge, there are only a 
handful of works that have investigated the influence of data gaps on GW parameter 
estimation [10, 11]. In this paper we propose a first-step attempt to deal with
the treatment of data gaps that are irregular in both duration and frequency, and
investigate the effects on parameter estimation for white dwarf-white dwarf galactic
binary systems.

The paper is organized as follows. In Section 2 we present the response of the LISA
detector in the low frequency approximation and the polarizations of the GWs, and
detail how we treat the data gaps. In Section 3 we explain the setup of the Monte
Carlo (MC) simulation and outline some of the terminology related to GW data
analysis. Finally, in Section 4 we present the results of our simulations.

2. The LISA response and the treatment of gaps 

In this study, we chose to model the LISA response to an incoming GW in the low
frequency approximation (LFA) [12]. It has been shown that the LFA provides a

2.1. The LISA response in the low frequency approximation

The LFA response of a single LISA channel to an incoming gravitational wave with
both polarizations is

$$ h(t) = h_{+}(t) F^{+}(t) + h_{\times}(t) F^{\times}(t), \tag{1} $$

where the two polarizations of the gravitational wave, $h_{+,\times}(t)$, are given
for a non-chirping galactic binary by

$$ h_{+}(t) = A_0 \left(1 + \cos^2\iota\right) \cos\left(\Phi(t) + \varphi_0\right), \tag{2} $$

$$ h_{\times}(t) = -2 A_0 \cos\iota \, \sin\left(\Phi(t) + \varphi_0\right), \tag{3} $$

where $A_0$ is a constant amplitude, $\iota$ is the angle of inclination of the
source's orbital plane with respect to the observer, and $\varphi_0$ is a constant
initial phase. The time dependent phase of the gravitational wave, $\Phi(t)$, for a
circular orbit is defined by

$$ \Phi(t) = 2\pi f_0 \left( t + R_{\oplus} \sin\theta \cos\left(2\pi f_m t - \phi\right) \right), \tag{4} $$

where $f_0$ is the quasi-monochromatic frequency, $f_m = 1/T_{yr}$ is the LISA
modulation frequency, $T_{yr}$ is the number of seconds in a year, $R_{\oplus}$ is the
light travel time across 1 AU (~500 s) and $(\theta, \phi)$ represent the sky location
of the source. The quantities $F^{+,\times}(t)$ are defined in the LFA by [H]

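$$ F^{+}(t; \psi, \theta, \phi, \lambda) = \frac{1}{2} \left[ \cos(2\psi) D^{+}(t; \theta, \phi, \lambda) - \sin(2\psi) D^{\times}(t; \theta, \phi, \lambda) \right], \tag{5} $$

$$ F^{\times}(t; \psi, \theta, \phi, \lambda) = \frac{1}{2} \left[ \sin(2\psi) D^{+}(t; \theta, \phi, \lambda) + \cos(2\psi) D^{\times}(t; \theta, \phi, \lambda) \right], \tag{6} $$

where $\psi$ is the polarization angle of the wave and $\lambda = 0$ or $3\pi/2$
defines the LISA A and E TDI channels. The detector pattern functions
$D^{+,\times}(t)$ are given by

$$ D^{+}(t) = \frac{\sqrt{3}}{64} \Big[ -36 \sin^2(\theta) \sin(2\alpha(t) - 2\lambda)
   + (3 + \cos(2\theta)) \Big( \cos(2\phi) \big( 9 \sin(2\lambda) - \sin(4\alpha(t) - 2\lambda) \big)
   + \sin(2\phi) \big( \cos(4\alpha(t) - 2\lambda) - 9 \cos(2\lambda) \big) \Big)
   - 4\sqrt{3} \sin(2\theta) \big( \sin(3\alpha(t) - 2\lambda - \phi) - 3 \sin(\alpha(t) - 2\lambda + \phi) \big) \Big], \tag{7} $$

$$ D^{\times}(t) = \frac{1}{16} \Big[ \sqrt{3} \cos(\theta) \big( 9 \cos(2\lambda - 2\phi) - \cos(4\alpha(t) - 2\lambda - 2\phi) \big)
   - 6 \sin(\theta) \big( \cos(3\alpha(t) - 2\lambda - \phi) + 3 \cos(\alpha(t) - 2\lambda + \phi) \big) \Big], \tag{8} $$

where $\alpha(t) = 2\pi t / T_{yr} + \kappa$ is the orbital phase of the center of
mass of the constellation and $\kappa$ is the initial ecliptic longitude, which we
take to be zero for this study.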
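
As an illustration of how Equations (1)-(8) fit together, the following is a minimal
Python sketch of the single-channel LFA response for a monochromatic binary. It is
not the code used for this study: the function names, the numerical values chosen
for $T_{yr}$ and $R_{\oplus}$, and the use of NumPy are all assumptions made for the
example.

```python
import numpy as np

YEAR = 3.15581498e7   # seconds in one year, T_yr (assumed value)
R_AU = 499.005        # light travel time across 1 AU in seconds (~500 s)
F_M = 1.0 / YEAR      # LISA modulation frequency f_m


def pattern_functions(t, theta, phi, lam, kappa=0.0):
    """Detector pattern functions D+ and Dx, Eqs. (7) and (8)."""
    a = 2.0 * np.pi * t / YEAR + kappa  # orbital phase alpha(t)
    d_plus = (np.sqrt(3.0) / 64.0) * (
        -36.0 * np.sin(theta) ** 2 * np.sin(2 * a - 2 * lam)
        + (3.0 + np.cos(2 * theta)) * (
            np.cos(2 * phi) * (9 * np.sin(2 * lam) - np.sin(4 * a - 2 * lam))
            + np.sin(2 * phi) * (np.cos(4 * a - 2 * lam) - 9 * np.cos(2 * lam))
        )
        - 4 * np.sqrt(3.0) * np.sin(2 * theta) * (
            np.sin(3 * a - 2 * lam - phi) - 3 * np.sin(a - 2 * lam + phi)
        )
    )
    d_cross = (1.0 / 16.0) * (
        np.sqrt(3.0) * np.cos(theta) * (
            9 * np.cos(2 * lam - 2 * phi) - np.cos(4 * a - 2 * lam - 2 * phi)
        )
        - 6 * np.sin(theta) * (
            np.cos(3 * a - 2 * lam - phi) + 3 * np.cos(a - 2 * lam + phi)
        )
    )
    return d_plus, d_cross


def lfa_response(t, A0, iota, psi, phi0, f0, theta, phi, lam=0.0):
    """Single-channel response h(t) of Eq. (1) for a non-chirping binary."""
    # Doppler-modulated phase, Eq. (4)
    phase = 2 * np.pi * f0 * (
        t + R_AU * np.sin(theta) * np.cos(2 * np.pi * F_M * t - phi)
    )
    # Polarizations, Eqs. (2) and (3)
    h_plus = A0 * (1 + np.cos(iota) ** 2) * np.cos(phase + phi0)
    h_cross = -2 * A0 * np.cos(iota) * np.sin(phase + phi0)
    # Beam pattern functions, Eqs. (5) and (6)
    d_plus, d_cross = pattern_functions(t, theta, phi, lam)
    f_plus = 0.5 * (np.cos(2 * psi) * d_plus - np.sin(2 * psi) * d_cross)
    f_cross = 0.5 * (np.sin(2 * psi) * d_plus + np.cos(2 * psi) * d_cross)
    return h_plus * f_plus + h_cross * f_cross
```

Calling `lfa_response` once with `lam=0.0` and once with `lam=3*np.pi/2` would give
the two channels referred to in the text. A quick sanity check on the sketch is that
rotating the polarization angle by $\pi/2$ flips the sign of the response.
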
Figure 1. Power spectrum of a response function composed of approximately
one hour rectangular data gaps, once a day, over a period of one year. 



2.2. The treatment of data gaps 

It is expected that there will be gaps in the LISA data stream due to a number of
different phenomena. While the frequency and duration of these gaps are at present
unknown, we do know that the gaps will need to be treated in some way. For
practical purposes, we can consider the gaps in the data as the product of the
LISA response $h(t)$, given by Equation (1), and a response function $w(t)$
consisting of a series of rectangular step functions, such that the gapped response
is now $h_g(t) = h(t) w(t)$, where

$$ w(t) = \begin{cases} 0, & t^{s}_{gap} \le t \le t^{e}_{gap}, \\ 1, & \text{otherwise}. \end{cases} \tag{9} $$

Here $t^{s}_{gap}$ and $t^{e}_{gap}$ are the times when the gap starts and finishes
respectively. If laser lock has been lost during the information down time, there
will also be a transient response directly after the gap, with a duration on the
order of minutes [15]. Left untreated, a sudden discontinuity in the time domain
data causes massive spectral leakage in the Fourier domain. This is because the
Fourier transform of a rectangular function of duration $\tau$ is the sinc function
$\tilde{w}(f) = \tau \, \mathrm{sinc}(\pi f \tau)$, whose sidelobes decay only slowly
as $f \to \infty$. As an example, in Figure 1 we plot the power spectrum of a
response function that corresponds to an untreated data gap of approximately one
hour, once a day, every day for a year. We can see the high level of spectral
leakage at higher frequencies in this response. We also notice that after the first
few sidelobes, there is a very slow drop-off in the amplitude of the lobes, which
contribute spurious power at higher frequencies. It is this excess power that we
need to remove as we search for GW sources.
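
The leakage described above can be reproduced with a short numerical experiment.
The sketch below is an illustration rather than the procedure used to produce
Figure 1: it builds a rectangular response function with one hard-edged, one-hour
gap per day and takes its power spectrum. The 30-day duration, the 60 s sampling and
the function names are assumptions chosen to keep the example small, and the gaps
are strictly periodic here.

```python
import numpy as np

DAY = 86400.0  # seconds


def rectangular_gap_mask(n_days=30, dt=60.0, gap_start=3 * 3600.0, gap_len=3600.0):
    """Response function w(t): 0 inside a daily gap, 1 elsewhere."""
    t = np.arange(0.0, n_days * DAY, dt)
    t_in_day = np.mod(t, DAY)
    w = np.ones_like(t)
    w[(t_in_day >= gap_start) & (t_in_day < gap_start + gap_len)] = 0.0
    return t, w


def power_spectrum(w, dt):
    """One-sided power spectrum |w~(f)|^2 of a sampled response function."""
    w_f = np.fft.rfft(w) * dt
    freqs = np.fft.rfftfreq(len(w), d=dt)
    return freqs, np.abs(w_f) ** 2


t, w = rectangular_gap_mask()
freqs, power = power_spectrum(w, dt=60.0)

# Strongest non-DC line: with strictly periodic gaps it sits at 1/day,
# followed by a slowly decaying comb of harmonics (the spectral leakage).
k_peak = 1 + np.argmax(power[1:])
peak_frequency = freqs[k_peak]
```

Because the gaps in this toy example are strictly periodic, the leaked power is
concentrated in a comb of lines at multiples of 1/day whose envelope follows the
slowly decaying sinc function.
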
Figure 2. A comparison of the power spectra of different window functions for
a seven hour long duration, with a gap of one hour. In each cell we also plot a 
rectangular response function as an example of the worst case function. 

Before continuing, we should comment on the option of interpolating across the
data gap. It is common in many astronomical fields to interpolate gapped data, as
short timescale data snippets may be taken over a long period of time (e.g. pulsar
timing arrays). The danger for LISA, we feel, is the large number of sources
potentially in the data set. If the data stream were composed solely of many
millions of non-chirping galactic binaries, we could envisage a safe interpolation
of the data, as it would merely contain a superposition of overlapping sinusoids.
However, we could also imagine a more realistic scenario where, just before the
gap, there is a supermassive black hole coalescence. In order to interpolate the
data, we need to use information from both sides of the gap. A large signal ending
just before the gap could massively distort the interpolated data, leading to a
spurious source detection in the gap. We have therefore decided to err on the safe
side of the "garbage in, garbage out" idiom, and not interpolate the data. We
instead focus on using a window function to smooth the discontinuity and reduce the
spectral leakage caused by the gaps in the time domain stream.

A perfect window function would have a narrow central peak (for high resolution),
a low first sidelobe (indicating good noise suppression) and a rapid fall-off of
sidelobes. In practice this is difficult to achieve, as windows with a narrow
central peak may have a slow fall-off in sidelobes, while a window with a rapid
fall-off in sidelobes may have a wider central peak. The goal is therefore to
strike a balance. We investigated a number of different window functions, as can be
seen in Figure 2, where we present their power spectra. In each case the time
series had a duration of seven hours with a one hour gap after approximately three
hours. In the end we chose the Hann function, which strikes a good balance between
frequency resolution and noise suppression. For example, the rectangular window has
its first zeros at ±1 bins, but the highest sidelobe is only 13.3 dB below the
central peak and is located at ±1.43 bins. Furthermore, the sidelobes fall off only
as $f^{-1}$. The Blackman-Harris window has a wider central peak with the first
zeros at ±4 bins, but with a lower first sidelobe, 92 dB below the central peak,
located at ±4.52 bins. However, for this function, the sidelobes also fall off as
$f^{-1}$. In contrast, the Hann window has its first zeros at ±2 bins. The highest
sidelobe is at -31.5 dB and is located at ±2.36 bins. Finally, the sidelobes fall
off as $f^{-3}$. Therefore, with the Hann window we have an acceptable widening of
the central peak, a low first sidelobe, and sidelobes that fall off at a rapid
rate. The Hann window used to treat the data gaps is defined as

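$$ w(t) = \begin{cases}
\frac{1}{2}\left[1 + \cos\left(\pi \, \frac{t - (t^{s}_{gap} - t_{win})}{t_{win}}\right)\right], & t^{s}_{gap} - t_{win} < t < t^{s}_{gap}, \\
0, & t^{s}_{gap} \le t \le t^{e}_{gap}, \\
\frac{1}{2}\left[1 - \cos\left(\pi \, \frac{t - t^{e}_{gap}}{t_{win}}\right)\right], & t^{e}_{gap} < t < t^{e}_{gap} + t_{win}, \\
1, & t \le t^{s}_{gap} - t_{win}, \quad t \ge t^{e}_{gap} + t_{win},
\end{cases} \tag{10} $$

where $t^{s}_{gap}$ and $t^{e}_{gap}$ are the gap start and end times, and $t_{win}$
is the time necessary for the window function to go from 1 to 0, and conversely.
For this particular source, we experimented with a number of different values of
$t_{win}$. We found that $t_{win} = 10$ minutes was sufficient to smooth the
rectangular window and reduce spectral leakage to acceptable levels.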
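
To make the effect of the taper concrete, here is a hedged Python sketch that
applies a Hann taper of length $t_{win}$ on either side of a single gap and compares
its high-frequency power with that of a hard rectangular gap. The half-cosine form
used for the taper is the standard Hann shape and is an assumption about the exact
expression; the setup (seven hours of data, a one hour gap after three hours) follows
the Figure 2 description, while the 1 s sampling, the 10 mHz cut and the function
name `hann_gap_window` are choices made for the example.

```python
import numpy as np


def hann_gap_window(t, t_start, t_end, t_win=600.0):
    """Window that is 0 inside [t_start, t_end] and tapers to 1 over t_win."""
    w = np.ones_like(t, dtype=float)
    w[(t >= t_start) & (t <= t_end)] = 0.0
    # Taper from 1 down to 0 just before the gap
    down = (t > t_start - t_win) & (t < t_start)
    w[down] = 0.5 * (1.0 + np.cos(np.pi * (t[down] - (t_start - t_win)) / t_win))
    # Taper from 0 back up to 1 just after the gap
    up = (t > t_end) & (t < t_end + t_win)
    w[up] = 0.5 * (1.0 - np.cos(np.pi * (t[up] - t_end) / t_win))
    return w


dt = 1.0
t = np.arange(0.0, 7 * 3600.0, dt)
t_start, t_end = 3 * 3600.0, 4 * 3600.0

w_rect = np.where((t >= t_start) & (t <= t_end), 0.0, 1.0)
w_hann = hann_gap_window(t, t_start, t_end)

freqs = np.fft.rfftfreq(len(t), d=dt)
p_rect = np.abs(np.fft.rfft(w_rect)) ** 2
p_hann = np.abs(np.fft.rfft(w_hann)) ** 2

# Ratio of leaked power above 10 mHz: tapered window relative to hard gap
high = freqs > 1e-2
leak_ratio = p_hann[high].sum() / p_rect[high].sum()
```

In this toy setup `leak_ratio` comes out far below unity, which is the behaviour the
window is chosen for. The price is that each gap effectively removes an additional
$2 t_{win}$ of partially weighted data.
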



3. The Monte Carlo Setup 



To evaluate the fidelity of one of our templates in the search for a GW signal, we
define the noise-weighted inner product between a template $h(t)$ and a signal
$s(t)$ as

$$ \langle h | s \rangle = 2 \int_{0}^{\infty} \frac{\tilde{h}(f) \tilde{s}^{*}(f) + \tilde{h}^{*}(f) \tilde{s}(f)}{S_n(f)} \, df, \tag{11} $$

where $\tilde{h}(f) = \int_{-\infty}^{\infty} h(t) \exp(-2\pi i f t) \, dt$ is the
Fourier transform of $h(t)$, $\tilde{h}^{*}(f)$ is the complex conjugate of
$\tilde{h}(f)$ and $S_n(f) = S_n^{instr}(f) + S_n^{conf}(f)$ is the noise power
spectral density of the LISA detector. This quantity is the sum of instrumental
noise and confusion noise from the background of unresolved galactic binaries. The
instrumental noise for LISA is given by [16]

$$ S_n^{instr}(f) = \frac{1}{4L^2} \left[ 2 S_n^{pos}(f) \left( 2 + \cos^2\left(\frac{f}{f_*}\right) \right)
   + 8 S_n^{acc}(f) \left( 1 + \cos^2\left(\frac{f}{f_*}\right) \right)
   \left( \frac{1}{(2\pi f)^4} + \frac{(2\pi 10^{-4})^2}{(2\pi f)^6} \right) \right], \tag{12} $$

where $L = 5 \times 10^{6}$ km is the LISA arm length,
$S_n^{pos}(f) = 4 \times 10^{-22}\ \mathrm{m^2\,Hz^{-1}}$ and
$S_n^{acc}(f) = 9 \times 10^{-30}\ \mathrm{m^2\,s^{-4}\,Hz^{-1}}$ are the position and
acceleration noises respectively and $f_* = 1/(2\pi L)$ is the mean transfer
frequency for the LISA arm. For the galactic confusion noise, we use the following
form derived from a Nelemans, Yungelson, Zwart galactic foreground model [17, 18]

$$ S_n^{conf}(f) = \begin{cases}
10^{-44.62} f^{-2.3}, & 10^{-4} < f \le 10^{-3}, \\
10^{-50.92} f^{-4.4}, & 10^{-3} < f \le 10^{-2.7}, \\
10^{-62.8} f^{-8.8}, & 10^{-2.7} < f \le 10^{-2.4}, \\
10^{-89.68} f^{-20}, & 10^{-2.4} < f \le 10^{-2},
\end{cases} \tag{13} $$

where the confusion noise has units of $\mathrm{Hz}^{-1}$.
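
The two noise contributions can be evaluated numerically as in the following Python
sketch of Equations (12) and (13). Two things here are assumptions rather than
statements from the text: the transfer frequency is converted to SI units as
$f_* = c/(2\pi L)$ (the expression $1/(2\pi L)$ corresponds to units with $c = 1$),
and the confusion noise is set to zero outside the tabulated frequency range. The
function names are illustrative.

```python
import numpy as np

C_LIGHT = 299792458.0                   # speed of light in m/s
L_ARM = 5.0e9                           # arm length in metres (5 x 10^6 km)
S_POS = 4.0e-22                         # position noise, m^2 / Hz
S_ACC = 9.0e-30                         # acceleration noise, m^2 s^-4 / Hz
F_STAR = C_LIGHT / (2 * np.pi * L_ARM)  # transfer frequency in Hz (assumes SI units)


def instrumental_noise(f):
    """Instrumental noise PSD of Eq. (12), in 1/Hz."""
    f = np.atleast_1d(np.asarray(f, dtype=float))
    c2 = np.cos(f / F_STAR) ** 2
    acc_shape = 1.0 / (2 * np.pi * f) ** 4 + (2 * np.pi * 1e-4) ** 2 / (2 * np.pi * f) ** 6
    bracket = 2 * S_POS * (2 + c2) + 8 * S_ACC * (1 + c2) * acc_shape
    return bracket / (4 * L_ARM ** 2)


def confusion_noise(f):
    """Piecewise galactic confusion noise of Eq. (13), in 1/Hz."""
    f = np.atleast_1d(np.asarray(f, dtype=float))
    bands = [
        (1e-4, 1e-3, -44.62, -2.3),
        (1e-3, 10.0 ** -2.7, -50.92, -4.4),
        (10.0 ** -2.7, 10.0 ** -2.4, -62.8, -8.8),
        (10.0 ** -2.4, 1e-2, -89.68, -20.0),
    ]
    s = np.zeros_like(f)  # assumed zero outside the tabulated range
    for lo, hi, log_amp, slope in bands:
        band = (f > lo) & (f <= hi)
        s[band] = 10.0 ** log_amp * f[band] ** slope
    return s


def total_noise(f):
    """Total PSD S_n(f) = S_instr(f) + S_conf(f)."""
    return instrumental_noise(f) + confusion_noise(f)
```

A useful consistency check on the coefficients of Equation (13) is that the four
power-law segments join continuously at the band edges.
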

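Using the inner product in Equation (11), we define the SNR between signal and
template as

$$ \rho = \frac{\langle h | s \rangle}{\sqrt{\langle h | h \rangle}}, \tag{14} $$

and the overlap between the two waveforms in each channel as

$$ \mathcal{O} = \frac{\langle h | s \rangle}{\sqrt{\langle h | h \rangle \langle s | s \rangle}}. \tag{15} $$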
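
A discrete version of the inner product, SNR and overlap can be sketched as follows.
This is an illustrative implementation only: the PSD is passed in as a callable
`psd` (a hypothetical interface), the DC bin is dropped, and the integral is
approximated by a simple sum over FFT bins.

```python
import numpy as np


def inner_product(h, s, dt, psd):
    """Discrete approximation to the noise-weighted inner product, Eq. (11)."""
    freqs = np.fft.rfftfreq(len(h), d=dt)
    df = freqs[1] - freqs[0]
    h_f = np.fft.rfft(h) * dt
    s_f = np.fft.rfft(s) * dt
    # h~ s~* + h~* s~ is real; skip the DC bin where the PSD is undefined
    integrand = (h_f * np.conj(s_f) + np.conj(h_f) * s_f).real[1:] / psd(freqs[1:])
    return 2.0 * np.sum(integrand) * df


def snr(h, s, dt, psd):
    """SNR between template h and signal s, Eq. (14)."""
    return inner_product(h, s, dt, psd) / np.sqrt(inner_product(h, h, dt, psd))


def overlap(h, s, dt, psd):
    """Normalized overlap between two waveforms, Eq. (15)."""
    hs = inner_product(h, s, dt, psd)
    return hs / np.sqrt(inner_product(h, h, dt, psd) * inner_product(s, s, dt, psd))


# Toy example: a sinusoid with one untapered one-hour gap in one day of data,
# in white noise of constant PSD (both simplifying assumptions).
dt = 10.0
t = np.arange(0.0, 86400.0, dt)
h = np.cos(2 * np.pi * 5e-3 * t + 0.3)
mask = np.where((t >= 3 * 3600.0) & (t < 4 * 3600.0), 0.0, 1.0)


def white(f):
    return np.full_like(f, 2.0)


gapped_overlap = overlap(h, h * mask, dt, white)
```

For a monochromatic signal in white noise, removing a fraction $x$ of the data gives
an overlap of $\sqrt{1 - x}$, so the toy case above yields $\sqrt{23/24} \approx 0.979$.
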

To conclude, we introduce a final tool, the Fisher information matrix (FIM), given by

$$ \Gamma_{\mu\nu} = \left\langle \frac{\partial h}{\partial \lambda^{\mu}} \middle| \frac{\partial h}{\partial \lambda^{\nu}} \right\rangle, \tag{16} $$

the inverse of which is the variance-covariance matrix. In the limit of high SNR
(which is almost always the case for LISA), the diagonal elements of the
variance-covariance matrix provide an estimate of the variance of each parameter.
In the case of monochromatic GBs, our system is defined by the parameter set
$\lambda^{\mu} = \{\ln A_0, \iota, \psi, \varphi_0, \ln f_0, \cos\theta, \phi\}$.
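
The FIM calculation can be illustrated on a reduced problem. The sketch below is not
the seven-parameter LISA calculation: it uses a hypothetical three-parameter toy
waveform $h = A_0 \cos(2\pi f_0 t + \varphi_0)$ with parameters
$(\ln A_0, \ln f_0, \varphi_0)$, white noise of constant PSD, and central finite
differences for the derivatives, all of which are assumptions made to keep the
example short.

```python
import numpy as np


def inner_product(a, b, dt, s0):
    """Eq. (11) for white noise of constant PSD s0 (simplifying assumption)."""
    a_f = np.fft.rfft(a) * dt
    b_f = np.fft.rfft(b) * dt
    df = 1.0 / (len(a) * dt)
    return 4.0 * np.sum((a_f * np.conj(b_f)).real[1:]) * df / s0


def toy_waveform(t, params):
    """Hypothetical monochromatic signal with parameters (ln A0, ln f0, phi0)."""
    ln_a, ln_f, phi0 = params
    return np.exp(ln_a) * np.cos(2 * np.pi * np.exp(ln_f) * t + phi0)


def fisher_matrix(t, params, dt, s0, step=1e-6):
    """Fisher matrix of Eq. (16) using central finite differences."""
    params = np.asarray(params, dtype=float)
    derivs = []
    for i in range(len(params)):
        up, down = params.copy(), params.copy()
        up[i] += step
        down[i] -= step
        derivs.append((toy_waveform(t, up) - toy_waveform(t, down)) / (2 * step))
    n = len(params)
    gamma = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            gamma[i, j] = inner_product(derivs[i], derivs[j], dt, s0)
    return gamma


dt, s0 = 10.0, 1e-40
t = np.arange(0.0, 86400.0, dt)
params = [np.log(1e-21), np.log(5e-3), 0.3]

gamma = fisher_matrix(t, params, dt, s0)
cov = np.linalg.inv(gamma)        # variance-covariance matrix
sigma = np.sqrt(np.diag(cov))     # 1-sigma parameter uncertainties
h = toy_waveform(t, params)
rho = np.sqrt(inner_product(h, h, dt, s0))  # optimal SNR
```

In this toy case the fractional amplitude error `sigma[0]` is very close to `1/rho`,
the familiar high-SNR scaling of the amplitude uncertainty.
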

In order to quantify the effects of the data gaps on GB parameter estimation, we
ran a full parameter Monte Carlo with 10,000 sources using a threshold of
$\rho > 5$. We chose this limit because below this level it will be difficult to
distinguish between a signal and noise. The duration and the frequency of planned
gaps in the data are not yet known, so we chose a duration of one hour and tested
two gap frequencies: once per day and once per week. A gap duration of one hour was
chosen as this should be sufficient to carry out operations on the spacecraft and
regain laser lock, if lost during the disruption. If the duration of all gaps is
the same, and the gap frequency is constant during the observation, this adds
(true) power to the two corresponding frequencies in the Fourier domain. To
minimize this power we modulate the duration and the frequency of the gaps with
$\delta t_{dur} = \pm 15$ min and $\delta t_{freq} = \pm 5$ h, respectively. As a
monochromatic GB's signal is essentially constant over many years, the recovered
SNR and parameter estimation depend on the duration of observation. To test the
effects of the gaps on both short and longer timescales, we ran MCs for each gap
frequency assuming mission lifetimes of both one and three years.
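
One way to generate such an irregular gap schedule is sketched below. The text only
specifies the size of the modulations, so drawing them from uniform distributions is
an assumption, as are the function name and the choice to place the first gap one
period after the start of the observation.

```python
import numpy as np


def gap_schedule(t_obs, period, rng, dur=3600.0, dur_jitter=900.0,
                 period_jitter=5 * 3600.0):
    """Start and end times of gaps with randomized duration and spacing.

    Nominal gaps occur every `period` seconds and last `dur` seconds; both
    are perturbed by uniform draws (an assumed distribution) of +/-5 h and
    +/-15 min respectively.
    """
    starts, ends = [], []
    k = 1
    while True:
        start = k * period + rng.uniform(-period_jitter, period_jitter)
        length = dur + rng.uniform(-dur_jitter, dur_jitter)
        if start + length >= t_obs:
            break
        starts.append(start)
        ends.append(start + length)
        k += 1
    return np.array(starts), np.array(ends)


rng = np.random.default_rng(0)
t_obs = 365 * 86400.0
daily_starts, daily_ends = gap_schedule(t_obs, 86400.0, rng)
weekly_starts, weekly_ends = gap_schedule(t_obs, 7 * 86400.0, rng)

# Fraction of the observation removed by the gaps themselves
daily_loss = np.sum(daily_ends - daily_starts) / t_obs
weekly_loss = np.sum(weekly_ends - weekly_starts) / t_obs
```

With these settings roughly 4% of the data falls inside gaps for the daily case and
roughly 0.6% for the weekly case, before any additional loss from tapering.
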

The parameter limits for the MC were chosen such that the amplitude lies within
the range $10^{-23} < A_0 < 10^{-20}$, the monochromatic frequency of the binary is
bounded by $10^{-4} < f_0/\mathrm{Hz} < 5 \times 10^{-3}$, and all other parameters
are drawn from within their natural limits. In order to quantify the effects of the
data gaps, we calculated the SNR and estimated parameter errors for cases with and
without gaps. While the concept of a global overlap over both detector channels is
a little ambiguous, it is possible to gain some idea of the goodness of fit by
investigating the overlaps between the gapped and ungapped data in each channel.
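
A source draw consistent with these limits could look like the sketch below. Only
the ranges come from the text. The sampling distributions (log-uniform in amplitude
and frequency, isotropic in orientation and sky position), the dictionary layout and
the hypothetical `snr_of` callback used for the $\rho > 5$ cut are assumptions.

```python
import numpy as np


def draw_source(rng):
    """Draw one galactic binary parameter set within the quoted limits."""
    return {
        "A0": 10.0 ** rng.uniform(-23.0, -20.0),              # amplitude
        "f0": 10.0 ** rng.uniform(-4.0, np.log10(5e-3)),      # frequency in Hz
        "iota": np.arccos(rng.uniform(-1.0, 1.0)),            # inclination
        "psi": rng.uniform(0.0, np.pi),                       # polarization angle
        "phi0": rng.uniform(0.0, 2 * np.pi),                  # initial phase
        "cos_theta": rng.uniform(-1.0, 1.0),                  # sky colatitude
        "phi": rng.uniform(0.0, 2 * np.pi),                   # sky longitude
    }


def select_sources(n_sources, snr_of, rng, threshold=5.0):
    """Keep drawing until n_sources pass the SNR threshold.

    `snr_of` is a user-supplied function mapping a parameter dict to an SNR.
    """
    kept = []
    while len(kept) < n_sources:
        source = draw_source(rng)
        if snr_of(source) > threshold:
            kept.append(source)
    return kept
```

In a full simulation `snr_of` would evaluate Equation (14) for the ungapped
waveform, and the same accepted sources would then be reused for each gap scenario
so that the comparison is like for like.
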
4. Results

We conducted MC simulations for observation times of both one and three years, and
for two gap frequencies of once per day and once per week. In each run, for 10,000
sources with $\rho > 5$, we investigated recovered SNRs and parameter uncertainties
with and without gaps. We also ran a final case where we assumed a gap frequency of
once per day over a period of one year, but with an added spurious data drop-out
that lasted a week. To quantify information loss due to the gaps, we calculated
overlaps between the waveform with gaps and the waveform without gaps in the A and
E TDI channels. In Figures 3 and 4 we present our results for the two observation
periods. For GBs, while the system is defined by a seven parameter set, only a
subset is important for astronomical purposes. Each of the figures is therefore
organised in the following way. On the top row, we present the fractional error in
the amplitude, the error in the inclination of the binary plane and the fractional
error in the binary frequency. On the bottom row, we present the sky resolution
$\Delta\Omega = 2\pi \sqrt{\Sigma^{\theta\theta} \Sigma^{\phi\phi} - (\Sigma^{\theta\phi})^2}$
(where $\Sigma^{\mu\nu}$ are the components of the inverse FIM and the sky resolution
has units of steradians), the optimal SNR and finally, the overlaps between the
sources with and without gaps, for the LISA A and E channels. In each cell
describing the parameter errors and SNR, the black curve corresponds to no data
gaps, the red dashed line denotes the case of one gap per week and the blue
dot-dashed line represents the case of one gap per day. In the final cell, the dark
lines (solid and dashed) represent the overlaps between gapped and ungapped data
in the case of one gap per week, while the orange lines (solid and dashed) denote
the case of one gap per day.

The first thing to observe from both figures is that it is very difficult to
visually make out a difference between the three cases for the estimation of
parameters and recovered SNRs. What is obvious is that, for both one and three year
observation periods, the median overlap between a gapped template and an ungapped
template drops from about 99.7% in both channels for one gap a week, to about 98%
for one gap a day (Table 1). This demonstrates that the more frequent gaps do have
an effect on parameter estimation which is not picked up visually in the other
cells.

In Table 1 we present the median error prediction in the estimation of parameters
for both the one and three year observation periods for all data gap frequencies.
In the case of a one year observation period with a gap of one hour, once a week,
the data gaps introduce a bias of between 0.5% and 1% in the estimation of the
amplitude, frequency and sky resolution. Once we go to one gap per day, this bias
increases to between 3% and 7% depending on the parameter. The inclusion of a
spurious one week gap increases the bias to between 4% and 9% for the important
parameters. The decrease in recovered SNR follows a similar pattern. Once we go to
the three year mission case, the situation does not change, with the estimated
biases remaining almost constant. This is not really a surprising result, as the
monochromatic binary essentially outputs a constant power during the mission
lifetime. The effect of the data gaps is the elimination of a small number of data
points per observation period. This means that the total number of missing data
points scales with, and is compensated for by, the observation period. We do not
expect this to be the case for sources with significant frequency evolution, as
more and more data points are lost within each subsequent gap.
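
The quoted percentages can be checked directly against the one-year medians in
Table 1. The short Python snippet below does this arithmetic for the amplitude,
frequency, sky resolution and SNR. The dictionary keys are labels chosen for the
example; the numbers are copied from the table.

```python
# One-year medians from Table 1
table_1yr = {
    "no gaps":        {"dA/A": 1.165803e-1, "df0/f0": 4.122052e-7,
                       "dOmega": 7.379220e-4, "SNR": 53.27678},
    "1/week":         {"dA/A": 1.172101e-1, "df0/f0": 4.139029e-7,
                       "dOmega": 7.452740e-4, "SNR": 52.97586},
    "1/day":          {"dA/A": 1.210735e-1, "df0/f0": 4.250480e-7,
                       "dOmega": 7.861529e-4, "SNR": 51.49511},
    "1/day + 1 week": {"dA/A": 1.222675e-1, "df0/f0": 4.289691e-7,
                       "dOmega": 8.009114e-4, "SNR": 50.85469},
}


def percent_change(reference, value):
    """Relative change of `value` with respect to `reference`, in percent."""
    return 100.0 * (value - reference) / reference


reference = table_1yr["no gaps"]
bias = {
    case: {name: percent_change(reference[name], val) for name, val in row.items()}
    for case, row in table_1yr.items()
    if case != "no gaps"
}

for case, row in bias.items():
    print(case, {name: round(val, 2) for name, val in row.items()})
```

The sky resolution is the most strongly affected of these quantities in every
scenario, and the SNR decreases by an amount comparable to the growth in the
fractional amplitude error.
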
Figure 3. Normalized histograms for distributions in the fractional error in the
amplitude, the error in the inclination, the fractional error in the binary frequency,
the sky resolution, the optimal SNR and overlaps between the sources, with and
without gaps, for the LISA A and E channels assuming an observation time of one
year. In the first five cells, going left to right, top to bottom, we plot the results
without gaps (black curve), with one gap per week (red dashed curve) and with
one gap per day (blue dot-dashed curve). In the last cell we plot the overlaps
for channels A and E for one gap per week (dark curves) and one gap per day
(orange curves). Note that the A and E channels are indistinguishable for each
simulation.

5. Conclusions 

In this study, we have investigated the effects of data gaps on the estimation of
parameters for galactic binaries with LISA. We have demonstrated that, left
untreated, the data gaps cause massive spectral leakage in the Fourier domain,
where the majority of the data analysis is conducted. This spectral leakage would
have a large effect on the estimation of parameters and could in fact lead to
spurious detections. Using a Hann window function, we smooth the discontinuities in
the time-domain data stream in order to reduce the leakage as far as possible.

In order to test the validity of our window function, we then ran a $10^{5}$
iteration Monte Carlo simulation to investigate the estimation of parameter errors
using a complete data set, a gapped but untreated data set and finally a gapped and
treated data set, for both one and three year LISA missions. As the frequency and
duration of the gaps are presently unknown, we simulated scenarios where we have
one gap per day and one gap per week, where each gap has a 60 ± 15 minute duration.
We found that one gap per week introduces a bias of between 0.5% and 1% in the
parameter estimation and allows us to recover more than a 99% correlation with a
perfect data set containing no gaps. However, if we have to contend with one gap
per day, the correlation with a perfect data set drops to about 98% and the
parameter error increases by 3% to 7% depending on the parameter being investigated.
Figure 4. Same as Figure 3 but with an observation time of three years.



Parameter    1 year             1 year             1 year             1 year
             No gaps            1 / week           1 / day            1 / day + 1 week

ΔA/A         1.165803 x 10^-1   1.172101 x 10^-1   1.210735 x 10^-1   1.222675 x 10^-1
Δι/rad       1.278432 x 10^-1   1.284081 x 10^-1   1.320791 x 10^-1   1.333090 x 10^-1
Δψ/rad       1.877202 x 10^-1   1.885523 x 10^-1   1.944390 x 10^-1   1.966201 x 10^-1
Δφ0/rad      4.220753 x 10^-1   4.228617 x 10^-1   4.339971 x 10^-1   4.384876 x 10^-1
Δf0/f0       4.122052 x 10^-7   4.139029 x 10^-7   4.250480 x 10^-7   4.289691 x 10^-7
ΔΩ/ster      7.379220 x 10^-4   7.452740 x 10^-4   7.861529 x 10^-4   8.009114 x 10^-4
SNR          53.27678           52.97586           51.49511           50.85469
O_A          1                  0.997110           0.979505           0.973763
O_E          1                  0.997109           0.979511           0.973634

Parameter    3 years            3 years            3 years
             No gaps            1 / week           1 / day

ΔA/A         6.711915 x 10^-2   6.752622 x 10^-2   6.973013 x 10^-2
Δι/rad       7.366606 x 10^-2   7.401683 x 10^-2   7.613257 x 10^-2
Δψ/rad       1.012134 x 10^-1   1.016863 x 10^-1   1.048081 x 10^-1
Δφ0/rad      2.128610 x 10^-1   2.137969 x 10^-1   2.204804 x 10^-1
Δf0/f0       5.713411 x 10^-8   5.737847 x 10^-8   5.898288 x 10^-8
ΔΩ/ster      1.796017 x 10^-4   1.810762 x 10^-4   1.915378 x 10^-4
SNR          92.38947           91.90973           89.25755
O_A          1                  0.997093           0.979487
O_E          1                  0.997093           0.979488

Table 1. Median parameter uncertainties, SNRs and single channel overlaps for
galactic binaries assuming different data gap frequencies, and mission timescales
of one and three years.

We finally simulated a scenario where, as well as the gap of once per day, we also
had a spurious week long data gap. In this particular case the bias in parameter
estimation increases to between 4% and 9% for the most important parameters. The
overwhelming conclusion for galactic binaries is that, in order to reduce any bias
in the parameter estimation due to the gaps to under 1%, we require a gap frequency
of no more than once per week. In the future we plan to investigate the effect of
gaps on sources such as the inspirals of supermassive black hole binaries and
EMRIs, where we feel the effect of the gaps may be more pronounced.